Reordering edges and elements in unstructured meshes to reduce execution time in Finite Element Computations

Reverse Cuthill McKee (RCM) reordering can be applied to either edges or elements of unstructured meshes (triangular/tetrahedral), in accordance to the respective finite element formulation


Introduction
The finite element method is one of the most popular general purpose techniques for computing accurate solutions to partial differential equations (pdes).Since pdes form the basis for many mathematical models in the physical sciences and other fields, it is not hard to realize the importance of the finite element method.The finite element method reduces a boundary value problem for a partial differential equation or system of pdes to a system of linear equations, written in a matrix form KU f  that can be solved numerically.In many cases the stiffness   K matrix is symmetric and sparse, with the sparse structure highly influenced by the node (nodal finite element method [21], edge (edge elements [10]), and element (discontinuous finite element method [4,3]) numberings provided by the grid generator.
Here, sparse matrix techniques are preferable since the storage required increases as   0 N where N denotes the degrees of freedom of the problem.Storage can be reduced by storing only the nonzero elements with a compress sparse row format.Sparse matrix-vector multiplication is an important kernel in many iterative solvers; with parallel implementations We can obtain a solution in a reasonable amount of time but they suffer from low cache utilization due to unstructured data access patterns.[7,2] proposed to reorder sparse matrices using a bandwidth reduction technique in order to reduce the number of cache misses to improve the memory-system performance of the sparse matrix-vector multiplication.Sometimes it is not easy to assemble the stiffness matrix and reorder its elements since the matrix is assembled at running Nº 20, Vol. 10 (1), 2018.ISSN 2007 -0705, pp.: 263 -279 -265 -time, or if it is too large it must be split among different processors.Here we discuss how to use RCM to reorder nodes, edges and elements of the mesh in order to reduce the bandwidth of the assembled matrices.The work is organized as follows: section 2 introduces the edge-and elementdefined finite element formulations employed, in section 3 the different meshes used in our numerical computations and the influence of the edge and element ordering in the sparse structure of the finite element matrices are presented, then section 4 introduces the RCM renumbering of edge and elements graphs, with some numerical experiments also included, and finally section 5 establishes the conclusions of this work.

Edge-element formulation
For the edge-element formulation let us consider the problem of calculating resonant frequencies of cavities ([8, 1]).The nodal formulation is: Where ˆˆx ,  and r  are the material permittivity and permeability respectively.
The finite element formulation is given by The electric field in a single triangular/tetrahedral element is represented as

Discontinuous formulation
For discontinuous formulation ([4], [3]) we consider the conservation laws the approximate solution is sought in the finite element space of discontinuous functions Pk is the local space of polynomials of degree k  .The weak formulation of (4) is Where  is a smooth function, n is the outward unit normal to face e of elementT .In the above expression we replaced u by its approximation k u , the first two terms correspond to volume integrals, and the last term requires the evaluation of numerical fluxes.

Meshes
The most popular element shapes employed for two and three dimensional applications are triangles and tetrahedra respectively; this is due to the fact that they are the simplest tessellation shapes for modeling arbitrary two and three dimensional geometries and they are also well suited for automatic mesh generation.To investigate the influence of the edge and element numbering in the structure of the stiffness matrix, we employ the grid generators Triangle [18] and Tetgen [19], which are two and three dimensional mesh generators that use Ruppert's Delaunay refinement.
These grid generators, in addition to providing nodes and elements, compute edges and neighbors of the elements (which is very convenient for element-defined and edge-defined finite element formulations).
Nº 20, Vol. 10 (1), 2018. ISSN 2007 -0705, pp.: 263 -279 -267 - We consider two simple geometries; for the two-dimensional case a circle with three interior circles removed, and for the three-dimensional case, a cylinder with three interior cylinders removed.An unstructured triangular mesh is generated by Triangle, whereas an unstructured tetrahedral mesh is obtained with Tetgen.Figures 1-2 show with a few elements the meshes considered.Also Figure 3 shows a two-dimensional view of the three-dimensional domain.
Table (1) shows the information for the meshes.Here n-nodes, n-elements and n-edges denote number of nodes, elements and edges respectively.In this section we discuss how the numbering of edges and elements affect the sparse structure of the stiffness matrix.
The structure of the three-dimensional case is discussed in detail and some notes are stated for the two-dimensional case.Let us introduce the following 3d sample mesh.
At [14] the local number of edges is specified and also more implementation details are included.
Finally in an element wise finite element formulation (finite volume or high order discontinuous Galerkin); we need the neighbors of each element.Table 4 shows the element to node connectivity Table with their four neighbors (three for triangles) for the 3d sample mesh.Here a zero means that the element does not have a neighbor (its edge/face is on the boundary).From the element to neighbors connectivity Table we can get the structure of the finite element matrix K in the discontinuous case; a loop over the rows of the Table says that for the row i the matrix where of course we consider only the nonzero elements of the neighbors connectivity Table and again we take into account the elements of the diagonal and the symmetries of the matrix.Note that the numbering of edges and elements is given by the order as they appear at the files provided by the grid generators.16 2 3 5 0 2 4 1 6 3 4 6 0 3 1 1 6 4 5 7 0 4 2   1 6 5 2 8 0 1 3   3 6 2 7 8 0 6

Edge renumbering
In ([13, 14]) two edge-numbering schemes provided by Jin [8] were tested for two-and threedimensional implementations respectively.Numerical experiments there showed that if the node labels are close enough, a good edge-numbering scheme (low bandwidth) can be defined by using an indicator to each edge (the indicator was 12 ii  with 1 i and 2 i denoting the end nodes of the edge) with the array of indicators rearranged by a sorting algorithm very similar to the edge-reordering scheme used in [6].In this work we use a different approach.Our starting point is any edge numbering, for example the one provided by Tetgen or Triangle.
To fix the ideas, let us refer to our 3d sample mesh from Figure 4.The last six (three for triangles) columns of element-to-edge connectivity Table (3

Elements Renumbering
For the discontinuous formulation, in addition to the description of the elements by the connectivity of their nodes, a list of the neighbors of each element is required.Both grid generators Triangle and Tetgen provide the neighbors for triangles and tetrahedra respectively.In our 3d sample mesh, the last four (three for triangles) columns of Table (4) define the connectivity of what we call the Element-Graph.This is our main contribution because once we have defined the Element-Graph we can apply the renumbering algorithm to its connectivity Table .Notice that we can generate its graphical representation by joining the barycenters of the elements of the original node graph.
Figure 5 shows this graph.

Numerical Experiments
We use meshes generated by the grid generators Triangle and Tetgen to assemble the matrices for the edge and discontinuous formulations given in section 2.
Let us make some comments about the edge-element formulation.Figure 6 shows a sparse matrix structure obtained by using the original edge-numbering provided by the grid generator [14], whereas the matrix structure showed at Figure 7 was assembled using a node-reordered mesh (RCM) followed by the S1 edge-numbering scheme, and finally the matrix structure showed in S1 is a good edge-numbering provided the nodes are close enough numbered, but the best result is obtained by directly applying the RCM reordering to the Edge-graph.In fact, after using a nodereordered mesh with the edge-numbering S1 the original bandwidth was reduced in a 332%, but when a RCM-renumbering is directly applied to the edge graph the bandwidth is reduced in a 610%.
As it has been widely reported, RCM applied to nodes of a mesh reduces the bandwidth of assembled matrices in nodal finite elements formulation.In order to obtain a bandwidth reduction in finite volume or discontinuous Galerkin formulations, we defined the Element Graph and applied the RCM reordering of elements.Figures 9 shows the sparse matrix structure for the elementdefined formulations for both the original tetrahedral mesh and the graph element RCM reordered.We observed similar results in two and three dimensions for nodes and elements.
Figure 9 shows the sparse structure of the matrix of the element wise finite element formulation for the original numbering of the elements and the RCM reordering of the elements graph.

Conclusions
Bandwidth reduction of matrices produced by nodal edge and element-defined finite element formulations were obtained.
To apply a renumbering scheme like RCM (originally designed for nodal) we have defined the N and Element-Graph.The optimal mesh obtained (node-reordered, edge-reordered, and elementreordered) can be very valuable.Some methods, for example those that combine nodal and edgedefined finite element formulations [9] or those that combine nodal finite element and finite volume methods [16] could directly benefit from these reorderings.We suggest that grid generators should incorporate nodal, edge and element renumbering of the produced meshes in order to reduce bandwidth of assembled finite element matrices, avoiding the preprocessing of the mesh or the post processing of the assembled finite element matrix (sometimes quite difficult because the matrix is assembled at execution time or distributed over different processors).Moreover, the bandwidth minimization renumbering scheme that has been discussed reduces the cache misses improving the execution time in the sparse-matrix-times-vector product implemented on OpenMp.Matrix-vector multiplication is one of the main kernels in the iterative solvers.In this work we have discussed first order nodal and (CT/LN) edge finite elements, and a future work is due to investigate high order nodal and edge elements beside extended stencil finite volume formulations (extended neighborhoods for high order finite volume formulations).

Figure 1 .
Figure 1.Two dimensional mesh for computational calculations.

Figure 2 .
Figure 2. Three dimensional mesh for computational calculations mesh.

Figure 3 .
Figure 3. Two dimensional view of the three dimensional domain for computational calculations

For
triangles there are three nodes, a boundary marker and its three neighbors.For tetrahedral: four nodes, a boundary marker and its four neighbors.Finally n edges lines that contain the nodes that define each edge.Figure (4) shows the 3d simple mesh with its nodal, edge and element numberings.

Figure 4 .
Figure 4. Three dimensional sample mesh with nodal, edge and element numberings.
reduction is obtained by a reordering of nodes of the finite element mesh.Some of the most common used algorithms include: Reverse Cuthill McKee [5], and Gibbs-Poole [15].TRIANGULATION RCM and TET MESH RCM are programs which compute the reverse Cuthill-McKee reordering for nodes in triangular/tetrahedral meshes; they are freely available in C++, FORTRAN90 and MATLAB versions [12].The nodal renumbering scheme is directly applied to the node graph.It is desirable that grid generators incorporate node renumbering algorithms.
) define the connectivity of what we Nº 20, Vol. 10 (1), 2018.ISSN 2007 -0705, pp.: 263 -279 -273 -call the Edge-Graph.A graphical representation of this mesh can be generated by joining the middle points of edges of the original Node-Mesh, Figure 5 shows this graph.This is our main contribution; because once we have defined the connectivity of the Edge-Graph we proceed to apply a renumbering algorithm as RCM.
Figure8was assembled using the RCM applied to the Edge-Graph.As mentioned in [14], scheme

Figure 6 .
Figure 6.Original edge formulation, as provided by the grid generator.

Figure 11 .
Figure 11.Matrix times vector execution time as a function of the number of iterations, elements formulation.
the length of edge m that connects nodes 1 m .Some other applications of edge finite elements in electromagnetics include wave propagation in both closed and open structures, such as metallic waveguides, open and shielded microstrip transmission lines, and optical waveguides or fibres.
This mesh file contains information of nodes, elements and edges.It can be obtained from the node, Nº 20, Vol. 10 (1), 2018.ISSN 2007 -0705, pp.: 263 -279 -269 -element, edges, and neighbors files provided by the grid generator Tetgen (two-dimensional versions of these files can be obtained by using Triangle).Its first line contains n-nodes, n-element and n-edges (number of nodes, elements and edges respectively), the following n-nodes lines contain the nodes coordinates (two or three), and the next n-elements lines contain the element node connectivity (nodes that define each element).
Table provided by the grid generator we can get the structure of the finite element matrix K .A loop over the n elements rows of this Table says that for the row i the matrix K has nonzero elements on

Table 2 .
Element to node connectivity Table of our 3d sample mesh.

Table 3 .
Element to edge connectivity Table of our sample mesh.
Now for the edge-element formulation with linear elements, the finite element matrix can be built in a similar way as it was built in the nodal case.Here an element-to-edge Table is required.Table3shows the element-to-edge connectivity Table for the sample mesh.The last six columns contain the numbering of each element-edge.A loop over the n-elements rows of this element-edges Tablesaysthat for the row i the matrix K has nonzero elements on Nº 20, Vol. 10 (1), 2018.ISSN 2007 -0705, pp.: 263 -279 -271 -

Table 4 .
Element to neighbors Table of our sample mesh.